-02 07:25pm From-HGS PATENT DEPT 



3013098504 



T-502 P. 21/34 F-745 



SEQUENCE LISTING 

(1) GENERAL INFORMATION:



(i) APPLICANT: Choi et al.

(ii) TITLE OF INVENTION: Streptococcus pneumoniae Antigens and
Vaccines

(iii) NUMBER OF SEQUENCES: 4 



(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Human Genome Sciences, Inc.

(B) STREET: 9410 Key West Avenue
(C) CITY: Rockville

(D) STATE: Maryland

(E) COUNTRY: USA

(F) ZIP: 20850 



(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4Mb storage

(B) COMPUTER: HP Vectra 486/33

(C) OPERATING SYSTEM: MSDOS version 6.2
(D) SOFTWARE: ASCII Text



(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER: 08/961,083
(B) FILING DATE: OCT-30-1997
(C) CLASSIFICATION:



(vii) PRIOR APPLICATION DATA:

(A) APPLICATION NUMBER: 60/029,960
(B) FILING DATE: OCT-31-1996



(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Michelle S. Marks

(B) REGISTRATION NUMBER: 41,971

(C) REFERENCE/DOCKET NUMBER: PB340P2



(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (301) 309-8504

(B) TELEFAX: (301) 309-8512 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2389 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

TTCTTACGAG TTGGGACTGT ATCAAGCTAG AACGGTTAAG GAAAATAATC GTGTTTCCTA 60

TATAGATGGA AAACAAGCGA CGCAAAAAAC GGAGAATTTG ACTCCTGATG AGGTTAGCAA 120 

GCGTGAAGGA ATCAATGCTG AGCAAATCGT CATCAAGATA ACAGACCAAG GCTATGTCAC 180

TTCACATGGC GACCACTATC ATTATTACAA TGGTAAGGTT CCTTATGACG CTATCATCAG 240

TGAAGAATTA CTCATGAAAG ATCCAAACTA TAAGCTAAAA GATGAGGATA TTGTTAATGA 300

GGTCAAGGGT GGATATGTTA TCAAGGTAGA TGGAAAATAC TATGTTTACC TTAAGGATGC 360 

TGCCCACGCG GATAACGTCC GTACAAAAGA GGAAATCAAT CGACAAAAAC AAGAGCATAG 420

TCAACATCGT GAAGGTGGAA CTCCAAGAAA CGATGGTGCT GTTGCCTTGG CACGTTCGCA 480

AGGACGCTAT ACTACAGATG ATGGTTATAT CTTTAATGCT TCTGATATCA TAGAGGATAC 540

TGGTGATGCT TATATCGTTC CTCATGGAGA TCATTACCAT TACATTCCTA AGAATGAGTT 600

ATCAGCTAGC GAGTTGGCTG CTGCAGAAGC CTTCCTATCT GGTCGAGGAA ATCTGTCAAA 660

TTCAAGAACC TATCGCCGAC AAAATAGCGA TAACACTTCA AGAACAAACT GGGTACCTTC 720 

TGTAAGCAAT CCAGGAACTA CAAATACTAA CACAAGCAAC AACAGCAACA CTAACAGTCA 780 

AGCAAGTCAA AGTAATGACA TOGATAGTCT CTTGAAACAG CTCTACAAAC TGCCTTTGAG 840 

TCAACGACAT GTAGAATCTG ATGGCCTTGT CTTTGATCCA GCACAAATCA CAAGTCGAAC 900

AGCTAGAGGT GTTGCAGTGC CACACGGAGA TCATTACCAC TTCATCCCTT AGTCTCAAAT 960 

GTCTGAATTG GAAGAACGAA TCGCTCGTAT TATTCCCCTT CGTTATCGTT CAAACCATTG 1020 

GGTACCAGAT TCAAGGCCAG AACAACCAAG TCCACAACCG ACTCCGGAAC CTAGTCCAGG 1080 

CCCGCAACCT GCACCAAATC TTAAAATAGA CTCAAATTCT TCTTTGGTTA GTCAGCTGGT 1140 

ACGAAAAGTT GGGGAAGGAT ATGTATTCGA AGAAAAGGGC ATCTCTCGTT ATGTCTTTGC 1200

GAAAGATTTA CCATCTGAAA CTGTTAAAAA TCTTGAAAGC AAGTTATCAA AACAAGAGAG 1260 

TGTTTCACAC ACTTTAACTG CTAAAAAAGA AAATGTTGCT CCTCGTGACC AAGAATTTTA 1320

TGATAAAGCA TATAATCTGT TAACTGAGGC TCATAAAGCC TTGTTTGNAA ATAAGGGTCG 1380 

TAATTCTGAT TTCCAAGCCT TAGACAAATT ATTAGAACGC TTGAATGATG AATCGACTAA 1440

TAAAGAAAAA TTGGTAGATG ATTTATTGGC ATTCCTAGCA CCAATTACCC ATCCAGAGCG 1500 

ACTTGGCAAA CCAAATTCTC AAATTGAGTA TACTGAAGAC GAAGTTCGTA TTGCTCAATT 1560 

AGCTGATAAG TATACAACGT CAGATGGTTA CATTTTTGAT GAACATGATA TAATCAGTGA 1620 

TGAAGGAGAT GCATATGTAA CGCCTCATAT GGGCCATAGT CACTGGATTG GAAAAGATAG 1680 

CCTTTCTGAT AAGGAAAAAG TTGCAGCTCA AGCCTATACT AAAGAAAAAG GTATCCTACC 1740 

XCCATCTCCA GACGCAGATG TTAAAGCAAA TCCAACTGGA GATAGTGCAG CAGCTATTTA 1800 

CAATCGTGTG AAAGGGGAAA AACGAATTCC ACTCGTTCGA CTTCCATATA TGGTTGAGCA 1860

TACAGTTGAG GTTAAAAACG GTAATTTGAT TATTCCTCAT AAGGATCAM ACCATAATAT 1920 

TAAATTTGCT TGGTTTGATG ATCACACATA CAAAGCTCCA AATGGCTATA CCTTGGAAGA 1980

TTTGTTTGCG ACGATTAAGT ACTACGTAGA ACACCCTGAC GAACGTCCAC ATTCTAATGA 2040

TGGATGGGGC AATGCCAGTG AGCATGTGTT AGGCAAGAAA GACCACAGTG AAGATCCAAA 2100

TAAGAACTTC AAAGCGGATG AAGAGCCAGT AGAGGAAACA CCTGCTGAGC CAGAAGTCCC 2160

TCAAGTAGAG ACTGAAAAAG TAGAAGCCCA ACTCAAAGAA GCAGAAGTTT TGCTTGCGAA 2220

AGTAACGGAT TCTAGTCTGA AAGCCAATGC AACAGAAACT CTAGCTGGTT TACGAAATAA 2280

TTTGACTCTT CAAATTATGG ATAACAATAG TATCATGGCA GAAGCAGAAA AATTACTTGC 2340

GTTGTTAAAA GGAAGTAATC CTTCATCTGT AAGTAAGGAA AAAATAAAC 2389



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 796 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Ser Tyr Glu Leu Gly Leu Tyr Gln Ala Arg Thr Val Lys Glu Asn Asn
1 5 10 15

Arg Val Ser Tyr Ile Asp Gly Lys Gln Ala Thr Gln Lys Thr Glu Asn
20 25 30

Leu Thr Pro Asp Glu Val Ser Lys Arg Glu Gly Ile Asn Ala Glu Gln
35 40 45

Ile Val Ile Lys Ile Thr Asp Gln Gly Tyr Val Thr Ser His Gly Asp
50 55 60

His Tyr His Tyr Tyr Asn Gly Lys Val Pro Tyr Asp Ala Ile Ile Ser
65 70 75 80

Glu Glu Leu Leu Met Lys Asp Pro Asn Tyr Lys Leu Lys Asp Glu Asp
85 90 95

Ile Val Asn Glu Val Lys Gly Gly Tyr Val Ile Lys Val Asp Gly Lys
100 105 110

Tyr Tyr Val Tyr Leu Lys Asp Ala Ala His Ala Asp Asn Val Arg Thr
115 120 125

Lys Glu Glu Ile Asn Arg Gln Lys Gln Glu His Ser Gln His Arg Glu
130 135 140

Gly Gly Thr Pro Arg Asn Asp Gly Ala Val Ala Leu Ala Arg Ser Gln
145 150 155 160

Gly Arg Tyr Thr Thr Asp Asp Gly Tyr Ile Phe Asn Ala Ser Asp Ile
165 170 175 



Ile Glu Asp Thr Gly Asp Ala Tyr Ile Val Pro His Gly Asp His Tyr
180 185 190

His Tyr Ile Pro Lys Asn Glu Leu Ser Ala Ser Glu Leu Ala Ala Ala
195 200 205

Glu Ala Phe Leu Ser Gly Arg Gly Asn Leu Ser Asn Ser Arg Thr Tyr
210 215 220

Arg Arg Gln Asn Ser Asp Asn Thr Ser Arg Thr Asn Trp Val Pro Ser
225 230 235 240

Val Ser Asn Pro Gly Thr Thr Asn Thr Asn Thr Ser Asn Asn Ser Asn
245 250 255

Thr Asn Ser Gln Ala Ser Gln Ser Asn Asp Ile Asp Ser Leu Leu Lys
260 265 270

Gln Leu Tyr Lys Leu Pro Leu Ser Gln Arg His Val Glu Ser Asp Gly
275 280 285

Leu Val Phe Asp Pro Ala Gln Ile Thr Ser Arg Thr Ala Arg Gly Val
290 295 300

Ala Val Pro His Gly Asp His Tyr His Phe Ile Pro Tyr Ser Gln Met
305 310 315 320

Ser Glu Leu Glu Glu Arg Ile Ala Arg Ile Ile Pro Leu Arg Tyr Arg
325 330 335

Ser Asn His Trp Val Pro Asp Ser Arg Pro Glu Gln Pro Ser Pro Gln
340 345 350

Pro Thr Pro Glu Pro Ser Pro Gly Pro Gln Pro Ala Pro Asn Leu Lys
355 360 365

Ile Asp Ser Asn Ser Ser Leu Val Ser Gln Leu Val Arg Lys Val Gly
370 375 380

Glu Gly Tyr Val Phe Glu Glu Lys Gly Ile Ser Arg Tyr Val Phe Ala
385 390 395 400

Lys Asp Leu Pro Ser Glu Thr Val Lys Asn Leu Glu Ser Lys Leu Ser
405 410 415

Lys Gln Glu Ser Val Ser His Thr Leu Thr Ala Lys Lys Glu Asn Val
420 425 430

Ala Pro Arg Asp Gln Glu Phe Tyr Asp Lys Ala Tyr Asn Leu Leu Thr
435 440 445

Glu Ala His Lys Ala Leu Phe Xaa Asn Lys Gly Arg Asn Ser Asp Phe
450 455 460

Gln Ala Leu Asp Lys Leu Leu Glu Arg Leu Asn Asp Glu Ser Thr Asn
465 470 475 480

Lys Glu Lys Leu Val Asp Asp Leu Leu Ala Phe Leu Ala Pro Ile Thr
485 490 495

His Pro Glu Arg Leu Gly Lys Pro Asn Ser Gln Ile Glu Tyr Thr Glu
500 505 510 

Asp Glu Val Arg Ile Ala Gln Leu Ala Asp Lys Tyr Thr Thr Ser Asp
515 520 525

Gly Tyr Ile Phe Asp Glu His Asp Ile Ile Ser Asp Glu Gly Asp Ala
530 535 540

Tyr Val Thr Pro His Met Gly His Ser His Trp Ile Gly Lys Asp Ser
545 550 555 560

Leu Ser Asp Lys Glu Lys Val Ala Ala Gln Ala Tyr Thr Lys Glu Lys
565 570 575

Gly Ile Leu Pro Pro Ser Pro Asp Ala Asp Val Lys Ala Asn Pro Thr
580 585 590

Gly Asp Ser Ala Ala Ala Ile Tyr Asn Arg Val Lys Gly Glu Lys Arg
595 600 605

Ile Pro Leu Val Arg Leu Pro Tyr Met Val Glu His Thr Val Glu Val
610 615 620

Lys Asn Gly Asn Leu Ile Ile Pro His Lys Asp His Tyr His Asn Ile
625 630 635 640

Lys Phe Ala Trp Phe Asp Asp His Thr Tyr Lys Ala Pro Asn Gly Tyr
645 650 655

Thr Leu Glu Asp Leu Phe Ala Thr Ile Lys Tyr Tyr Val Glu His Pro
660 665 670

Asp Glu Arg Pro His Ser Asn Asp Gly Trp Gly Asn Ala Ser Glu His
675 680 685

Val Leu Gly Lys Lys Asp His Ser Glu Asp Pro Asn Lys Asn Phe Lys
690 695 700

Ala Asp Glu Glu Pro Val Glu Glu Thr Pro Ala Glu Pro Glu Val Pro
705 710 715 720

Gln Val Glu Thr Glu Lys Val Glu Ala Gln Leu Lys Glu Ala Glu Val
725 730 735

Leu Leu Ala Lys Val Thr Asp Ser Ser Leu Lys Ala Asn Ala Thr Glu
740 745 750

Thr Leu Ala Gly Leu Arg Asn Asn Leu Thr Leu Gln Ile Met Asp Asn
755 760 765

Asn Ser Ile Met Ala Glu Ala Glu Lys Leu Leu Ala Leu Leu Lys Gly
770 775 780

Ser Asn Pro Ser Ser Val Ser Lys Glu Lys Ile Asn
785 790 795



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 37 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
AGTCGGATCC TTCTTACGAG TTGGGACTGT ATCAAGC 37 

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 40 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
AGTCAACCtfP GTTTATTTTT TCCTTACTTA CAGATGAAGG 40



